NSF PAR Search | NSF Public Access Repository

Cross-validation for training and testing co-occurrence network inference algorithms

https://doi.org/10.1186/s12859-025-06083-7

Agyapong, Daniel; Propster, Jeffrey Ryan; Marks, Jane; Hocking, Toby Dylan (December 2025, BMC Bioinformatics)

Abstract Background: Microorganisms are found in almost every environment, including soil, water, air and inside other organisms, such as animals and plants. While some microorganisms cause diseases, most of them help in biological processes such as decomposition, fermentation and nutrient cycling. Much research has been conducted on microbial communities in various environments and on how their interactions and relationships can provide insight into various diseases. Co-occurrence network inference algorithms help us understand the complex associations of microorganisms, especially bacteria. Existing network inference algorithms employ techniques such as correlation, regularized linear regression, and conditional dependence, which have different hyper-parameters that determine the sparsity of the network. These complex microbial communities form intricate ecological networks that are fundamental to ecosystem functioning and host health. Understanding these networks is crucial for developing targeted interventions in both environmental and clinical settings. The emergence of high-throughput sequencing technologies has generated unprecedented amounts of microbiome data, necessitating robust computational methods for network inference and validation.
Results: Previous methods for evaluating the quality of the inferred network include using external data and measuring network consistency across sub-samples, both of which have several drawbacks that limit their applicability in real microbiome composition data sets. We propose a novel cross-validation method to evaluate co-occurrence network inference algorithms, and new methods for applying existing algorithms to predict on test data. Our method demonstrates superior performance in handling compositional data and addressing the challenges of high dimensionality and sparsity inherent in real microbiome datasets. The proposed framework also provides robust estimates of network stability.
Conclusions: Our empirical study shows that the proposed cross-validation method is useful for hyper-parameter selection (training) and comparing the quality of inferred networks between different algorithms (testing). This advancement represents a significant step forward in microbiome network analysis, providing researchers with a reliable tool for understanding complex microbial interactions. The method’s applicability extends beyond microbiome studies to other fields where network inference from high-dimensional compositional data is crucial, such as gene regulatory networks and ecological food webs. Our framework establishes a new standard for validation in network inference, potentially accelerating discoveries in microbial ecology and human health.
Free, publicly-accessible full text available December 1, 2026
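The abstract of the cross-validation record above describes choosing a sparsity hyper-parameter by how well an inferred network predicts held-out data. The following is a minimal sketch of that general idea, not the authors' method: it assumes a simple correlation-threshold network, assumes the rows are already suitably transformed abundances (for example log-transformed), and predicts each variable on test samples from its network neighbors. All function names (`column_stats`, `fold_error`, `select_threshold`, `make_toy_data`) are hypothetical.

```python
import random


def column_stats(rows):
    """Return per-column means and the sample covariance matrix."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    cov = [[sum((r[j] - means[j]) * (r[k] - means[k]) for r in rows) / (n - 1)
            for k in range(p)] for j in range(p)]
    return means, cov


def fold_error(train, test, threshold):
    """Mean squared test error when each variable is predicted from the
    neighbors it has in a network with edges where |correlation| > threshold."""
    means, cov = column_stats(train)
    p = len(means)
    total = 0.0
    for j in range(p):
        neighbors = []
        for k in range(p):
            if k == j or cov[j][j] <= 0 or cov[k][k] <= 0:
                continue
            corr = cov[j][k] / (cov[j][j] * cov[k][k]) ** 0.5
            if abs(corr) > threshold:
                neighbors.append(k)
        for row in test:
            pred = means[j]  # no neighbors: fall back to the training mean
            if neighbors:
                # Average of one-variable regressions fitted on the training set.
                pred += sum(cov[j][k] / cov[k][k] * (row[k] - means[k])
                            for k in neighbors) / len(neighbors)
            total += (row[j] - pred) ** 2
    return total / (p * len(test))


def select_threshold(rows, thresholds, n_folds=5, seed=0):
    """K-fold cross-validation over candidate thresholds (assumed setup)."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    mean_err = {}
    for t in thresholds:
        errs = []
        for held_out in folds:
            held = set(held_out)
            train = [rows[i] for i in idx if i not in held]
            test = [rows[i] for i in held_out]
            errs.append(fold_error(train, test, t))
        mean_err[t] = sum(errs) / n_folds
    best = min(mean_err, key=mean_err.get)
    return best, mean_err


def make_toy_data(n, seed=0):
    """Toy data: columns 0 and 1 are strongly associated, column 2 is independent."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x0 = rng.gauss(0, 1)
        rows.append([x0, x0 + 0.1 * rng.gauss(0, 1), rng.gauss(0, 1)])
    return rows


if __name__ == "__main__":
    best, errors = select_threshold(make_toy_data(100, seed=1), [0.1, 0.5, 1.0])
    print("selected threshold:", best)
    print("mean test error per threshold:", errors)
```

A threshold of 1.0 yields an empty network, so every prediction falls back to the training mean; comparing it with smaller thresholds shows whether the inferred edges carry predictive information on held-out samples.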
Predicting Neuromuscular Engagement to Improve Gait Training With a Robotic Ankle Exoskeleton

https://doi.org/10.1109/LRA.2023.3291919

Harshe, Karl; Williams, Jack R.; Hocking, Toby D.; Lerner, Zachary F. (August 2023, IEEE Robotics and Automation Letters)

Full Text Available
Chatbots Language Design: The Influence of Language Variation on User Experience with Tourist Assistant Chatbots

https://doi.org/10.1145/3487193

Chaves, Ana Paula; Egbert, Jesse; Hocking, Toby; Doerry, Eck; Gerosa, Marco Aurelio (April 2022, ACM Transactions on Computer-Human Interaction)

Chatbots are often designed to mimic social roles attributed to humans. However, little is known about the impact of using language that fails to conform to the associated social role. Our research draws on sociolinguistics to investigate how a chatbot’s language choices can adhere to the expected social role the agent performs within a context. We seek to understand whether chatbot design should account for linguistic register. This research analyzes how register differences shape the user’s perception of the human-chatbot interaction. We produced parallel corpora of conversations in the tourism domain with similar content and varying register characteristics and evaluated users’ preferences for the chatbot’s linguistic choices in terms of appropriateness, credibility, and user experience. Our results show that register characteristics are strong predictors of users’ preferences, which points to the need to design chatbots with register-appropriate language to improve acceptance and users’ perceptions of chatbot interactions.
Full Text Available
SPARSEMODr: Rapidly simulate spatially explicit and stochastic models of COVID-19 and other infectious diseases

https://doi.org/10.1093/biomethods/bpac022

Mihaljevic, Joseph R.; Borkovec, Seth; Ratnavale, Saikanth; Hocking, Toby D.; Banister, Kelsey E.; Eppinger, Joseph E.; Hepp, Crystal; Doerry, Eck (September 2022, Biology Methods and Protocols)

Abstract Building realistically complex models of infectious disease transmission that are relevant for informing public health is conceptually challenging and requires knowledge of coding architecture that can implement key modeling conventions. For example, many of the models built to understand COVID-19 dynamics have included stochasticity, transmission dynamics that change throughout the epidemic due to changes in host behavior or public health interventions, and spatial structures that account for important spatio-temporal heterogeneities. Here we introduce an R package, SPARSEMODr, that allows users to simulate disease models that are stochastic and spatially explicit, including a model for COVID-19 that was useful in the early phases of the epidemic. SPARSEMOD stands for SPAtial Resolution-SEnsitive Models of Outbreak Dynamics, and our goal is to demonstrate particular conventions for rapidly simulating the dynamics of more complex, spatial models of infectious disease. In this report, we outline the features and workflows of our software package that allow for user-customized simulations. We believe the example models provided in our package will be useful in educational settings, as the coding conventions are adaptable, and will help new modelers to better understand important assumptions that were built into sophisticated COVID-19 models.
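To make "stochastic and spatially explicit" concrete, here is a minimal sketch of a discrete-time stochastic SIR model over a few coupled patches. It is written in Python for illustration only and does not reproduce the SPARSEMODr API or its COVID-19 model; the function names, the coupling matrix, and all parameter values are assumptions.

```python
import math
import random


def binomial(rng, n, p):
    """Draw from Binomial(n, p) as a sum of Bernoulli trials (fine for small n)."""
    return sum(rng.random() < p for _ in range(n))


def simulate_patches(pop, infected0, beta, gamma, coupling, days, seed=0):
    """Simulate S, I, R counts per patch with daily time steps.

    coupling[i][j] weights how much prevalence in patch j contributes to
    the force of infection in patch i (a hypothetical stand-in for movement).
    """
    rng = random.Random(seed)
    n = len(pop)
    S = [pop[i] - infected0[i] for i in range(n)]
    I = list(infected0)
    R = [0] * n
    history = [(list(S), list(I), list(R))]
    p_rec = 1 - math.exp(-gamma)  # daily recovery probability
    for _ in range(days):
        # Prevalence is frozen at the start of the day so patch order is irrelevant.
        prevalence = [I[i] / pop[i] for i in range(n)]
        for i in range(n):
            force = beta * sum(coupling[i][j] * prevalence[j] for j in range(n))
            p_inf = 1 - math.exp(-force)  # daily infection probability
            new_inf = binomial(rng, S[i], p_inf)
            new_rec = binomial(rng, I[i], p_rec)
            S[i] -= new_inf
            I[i] += new_inf - new_rec
            R[i] += new_rec
        history.append((list(S), list(I), list(R)))
    return history


if __name__ == "__main__":
    hist = simulate_patches(
        pop=[500, 300], infected0=[5, 0], beta=0.5, gamma=0.2,
        coupling=[[0.9, 0.1], [0.1, 0.9]], days=30, seed=42,
    )
    print("final state (S, I, R per patch):", hist[-1])
```

Running the simulation with different seeds gives different epidemic trajectories from identical parameters, which is the stochastic behavior the abstract refers to; the off-diagonal coupling terms are what let an outbreak seeded in one patch reach the other.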
A greedy graph search algorithm based on changepoint analysis for automatic QRS complex detection

https://doi.org/10.1016/j.compbiomed.2021.104208

Fotoohinasab, Atiyeh; Hocking, Toby; Afghah, Fatemeh (March 2021, Computers in Biology and Medicine)
Full Text Available
A Graph-constrained Changepoint Detection Approach for ECG Segmentation

https://doi.org/10.1109/EMBC44109.2020.9175333

Fotoohinasab, Atiyeh; Hocking, Toby; Afghah, Fatemeh (July 2020, 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC))
The electrocardiogram (ECG) signal is the most commonly used non-invasive tool in the assessment of cardiovascular diseases. Segmentation of the ECG signal to locate its constitutive waves, in particular the R-peaks, is a key step in ECG processing and analysis. Over the years, several segmentation and QRS complex detection algorithms have been proposed with different features; however, their performance depends heavily on preprocessing steps, which makes them unreliable for real-time data analysis in ambulatory care settings and remote monitoring systems, where the collected data is highly noisy. Moreover, current algorithms still have difficulty with the diverse morphological categories of the ECG signal and carry a high computation cost. In this paper, we introduce a novel graph-based optimal changepoint detection (GCCD) method for reliable detection of R-peak positions without employing any preprocessing step. The proposed model is guaranteed to compute the globally optimal changepoint detection solution. It is also generic in nature and can be applied to other time-series biomedical signals. Based on the MIT-BIH arrhythmia (MIT-BIH-AR) database, the proposed method achieves overall sensitivity Sen = 99.76, positive predictivity PPR = 99.68, and detection error rate DER = 0.55, which are comparable to other state-of-the-art approaches.
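The sensitivity, positive predictivity, and detection error rate quoted above are computed from counts of true positives, false positives, and false negatives. The sketch below shows one common convention, assumed here rather than taken from the paper: Sen = TP / (TP + FN), PPR = TP / (TP + FP), and DER = (FP + FN) / (TP + FN), all as percentages. The greedy matching within a tolerance window and the function names are likewise illustrative assumptions.

```python
def match_peaks(reference, detected, tolerance):
    """Greedily match detected R-peak sample indices to reference annotations.

    A detection counts as a true positive when it lies within `tolerance`
    samples of a not-yet-matched reference peak.
    Returns (true_positives, false_positives, false_negatives).
    """
    reference, detected = sorted(reference), sorted(detected)
    tp = i = j = 0
    while i < len(reference) and j < len(detected):
        diff = detected[j] - reference[i]
        if abs(diff) <= tolerance:
            tp += 1
            i += 1
            j += 1
        elif diff < 0:
            j += 1  # detection too early to match this or any later reference peak
        else:
            i += 1  # reference peak has no detection close enough: missed beat
    return tp, len(detected) - tp, len(reference) - tp


def detection_metrics(tp, fp, fn):
    """Return (Sen, PPR, DER) in percent; assumes tp + fn > 0 and tp + fp > 0."""
    sen = 100.0 * tp / (tp + fn)
    ppr = 100.0 * tp / (tp + fp)
    der = 100.0 * (fp + fn) / (tp + fn)  # one common definition, assumed here
    return sen, ppr, der


if __name__ == "__main__":
    reference = [100, 300, 500, 700]  # hypothetical annotated R-peaks
    detected = [102, 298, 420, 705]   # hypothetical detector output
    counts = match_peaks(reference, detected, tolerance=10)
    print("TP, FP, FN:", counts)
    print("Sen, PPR, DER (%):", detection_metrics(*counts))
```

Some publications normalize DER by the number of true positives instead of the number of annotated beats, so the definition in use should be checked before comparing reported values across papers.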
Full Text Available
A Graph-Constrained Changepoint Learning Approach for Automatic QRS-Complex Detection

Fotoohinasab, Atiyeh; Hocking, Toby; Afghah, Fatemeh (January 2020, IEEE Asilomar Conference on Signals, Systems, and Computers ASILOMAR)
This study presents a new viewpoint on ECG signal analysis by applying a graph-based changepoint detection model to locate R-peak positions. The model relies on a new graph learning algorithm that learns the constraint graph from labeled ECG data. The proposed learning algorithm starts with a simple initial graph and iteratively edits the graph so that the final graph has the maximum accuracy in R-peak detection. We evaluate the performance of the algorithm on the MIT-BIH Arrhythmia Database. The evaluation results demonstrate that the proposed method can obtain comparable results to other state-of-the-art approaches. The proposed method achieves an overall sensitivity of Sen = 99.64%, positive predictivity of PPR = 99.71%, and detection error rate of DER = 0.19.
Full Text Available
Microbial carbon use efficiency promotes global soil carbon storage

https://doi.org/10.1038/s41586-023-06042-3

Tao, Feng; Huang, Yuanyuan; Hungate, Bruce A.; Manzoni, Stefano; Frey, Serita D.; Schmidt, Michael W.; Reichstein, Markus; Carvalhais, Nuno; Ciais, Philippe; Jiang, Lifen; et al (June 2023, Nature)

Abstract Soils store more carbon than other terrestrial ecosystems [1,2]. How soil organic carbon (SOC) forms and persists remains uncertain [1,3], which makes it challenging to understand how it will respond to climatic change [3,4]. It has been suggested that soil microorganisms play an important role in SOC formation, preservation and loss [5–7]. Although microorganisms affect the accumulation and loss of soil organic matter through many pathways [4,6,8–11], microbial carbon use efficiency (CUE) is an integrative metric that can capture the balance of these processes [12,13]. Although CUE has the potential to act as a predictor of variation in SOC storage, the role of CUE in SOC persistence remains unresolved [7,14,15]. Here we examine the relationship between CUE and the preservation of SOC, and interactions with climate, vegetation and edaphic properties, using a combination of global-scale datasets, a microbial-process explicit model, data assimilation, deep learning and meta-analysis. We find that CUE is at least four times as important as other evaluated factors, such as carbon input, decomposition or vertical transport, in determining SOC storage and its spatial variation across the globe. In addition, CUE shows a positive correlation with SOC content. Our findings point to microbial CUE as a major determinant of global SOC storage. Understanding the microbial processes underlying CUE and their environmental dependence may help the prediction of SOC feedback to a changing climate.
Full Text Available